Attention Correctness in Neural Image Captioning
Attention mechanisms have recently been introduced in deep learning for
various tasks in natural language processing and computer vision. But despite
their popularity, the "correctness" of the implicitly-learned attention maps
has only been assessed qualitatively by visualization of several examples. In
this paper we focus on evaluating and improving the correctness of attention in
neural image captioning models. Specifically, we propose a quantitative
evaluation metric for the consistency between the generated attention maps and
human annotations, using recently released datasets with alignment between
regions in images and entities in captions. We then propose novel models with
different levels of explicit supervision for learning attention maps during
training. The supervision can be strong when alignments between regions and
caption entities are available, or weak when only object segments and
categories are provided. We show on the popular Flickr30k and COCO datasets
that introducing supervision of attention maps during training solidly improves
both attention correctness and caption quality, showing the promise of making
machine perception more human-like.
Comment: To appear in AAAI-17. See http://www.cs.jhu.edu/~cxliu/ for
supplementary material
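The abstract above does not spell out the evaluation metric, so the following is only a minimal sketch of one plausible form: the share of attention weight that falls inside the human-annotated region for a caption entity. The function name `attention_correctness`, the nested-list map layout, and the binary `region_mask` are assumptions made for illustration, not the authors' definition.

```python
def attention_correctness(attention, region_mask):
    """Return the fraction of attention mass inside the annotated region.

    attention   -- 2D list of non-negative attention weights (hypothetical layout)
    region_mask -- 2D list of 0/1 flags, same shape, 1 = inside the human annotation
    """
    total = sum(sum(row) for row in attention)
    if total <= 0:
        raise ValueError("attention map must have positive total weight")
    # Sum only the weights whose mask cell is set.
    inside = sum(
        weight
        for att_row, mask_row in zip(attention, region_mask)
        for weight, flag in zip(att_row, mask_row)
        if flag
    )
    # Dividing by the total normalizes maps that do not already sum to 1.
    return inside / total


if __name__ == "__main__":
    attention = [[0.1, 0.2], [0.3, 0.4]]
    region_mask = [[0, 0], [1, 1]]
    print(attention_correctness(attention, region_mask))
```

Under this assumed definition, a score near 1 means the model attends almost entirely to the annotated region, and a score near 0 means it attends elsewhere.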
Few-Shot Image Recognition by Predicting Parameters from Activations
In this paper, we are interested in the few-shot learning problem. In
particular, we focus on a challenging scenario where the number of categories
is large and the number of examples per novel category is very limited, e.g. 1,
2, or 3. Motivated by the close relationship between the parameters and the
activations in a neural network associated with the same category, we propose a
novel method that can adapt a pre-trained neural network to novel categories by
directly predicting the parameters from the activations. Zero training is
required in adaptation to novel categories, and fast inference is realized by a
single forward pass. We evaluate our method on few-shot image recognition
on the ImageNet dataset, where it achieves state-of-the-art classification
accuracy on novel categories by a significant margin while keeping comparable
performance on the large-scale categories. We also test our method on the
MiniImageNet dataset, where it strongly outperforms the previous
state-of-the-art methods.
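As a rough illustration of predicting classifier parameters from activations with no training at adaptation time, the sketch below uses the normalized mean activation of a category's few examples as that category's weight vector. This stand-in is an assumption: the abstract does not describe the mapping, and the function names `predict_parameters` and `classify` are hypothetical.

```python
import math


def predict_parameters(activations):
    """Map a category's few activation vectors to one weight vector.

    Assumption: the L2-normalized mean stands in for the mapping from
    activations to parameters, which the abstract does not describe.
    """
    dim = len(activations[0])
    mean = [sum(vec[i] for vec in activations) / len(activations) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean)) or 1.0  # guard against a zero vector
    return [x / norm for x in mean]


def classify(activation, weights_by_category):
    """Pick the category whose predicted weights score highest (dot product)."""
    scores = {
        name: sum(a * w for a, w in zip(activation, weights))
        for name, weights in weights_by_category.items()
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Two novel categories with 2 and 1 example activations respectively.
    novel = {
        "cat": predict_parameters([[1.0, 0.0], [0.9, 0.1]]),
        "dog": predict_parameters([[0.0, 1.0]]),
    }
    print(classify([0.8, 0.2], novel))
```

Adding a novel category here costs one pass over its examples and no gradient updates, which is the property the abstract emphasizes.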
Secure Transmission for Relay Wiretap Channels in the Presence of Spatially Random Eavesdroppers
We propose a secure transmission scheme for a relay wiretap channel, where a
source communicates with a destination via a decode-and-forward relay in the
presence of spatially random-distributed eavesdroppers. We assume that the
source is equipped with multiple antennas, whereas the relay, the destination,
and the eavesdroppers are equipped with a single antenna each. In the proposed
scheme, in addition to information signals, the source transmits artificial
noise signals in order to confuse the eavesdroppers. With the target of
maximizing the secrecy throughput of the relay wiretap channel, we derive a
closed-form expression for the transmission outage probability and an
easy-to-compute expression for the secrecy outage probability. Using these
expressions, we determine the optimal power allocation factor and wiretap code
rates that guarantee the maximum secrecy throughput, while satisfying a secrecy
outage probability constraint. Furthermore, we examine the impact of the number
of source antennas on the secrecy throughput, showing that adding extra transmit
antennas at the source brings about a significant increase in the secrecy
throughput.
Comment: 7 pages, 5 figures, accepted by IEEE Globecom 2015 Workshop on
Trusted Communications with Physical Layer Security